Verifying Fault-Tolerant Distributed Algorithms in the Heard-Of Model

نویسندگان

Henri Debrat

Stephan Merz

چکیده

Distributed computing is inherently based on replication, promising increased tolerance to failures of individual computing nodes or communication channels. Realizing this promise, however, involves quite subtle algorithmic mechanisms, and requires precise statements about the kinds and numbers of faults that an algorithm tolerates (such as process crashes, communication faults or corrupted values). The landmark theorem due to Fischer, Lynch, and Paterson shows that it is impossible to achieve Consensus among N asynchronously communicating nodes in the presence of even a single permanent failure. Existing solutions must rely on assumptions of “partial synchrony”. Indeed, there have been numerous misunderstandings on what exactly a given algorithm is supposed to realize in what kinds of environments. Moreover, the abundance of subtly different computational models complicates comparisons between different algorithms. Charron-Bost and Schiper introduced the Heard-Of model for representing algorithms and failure assumptions in a uniform framework, simplifying comparisons between algorithms. In this contribution, we represent the Heard-Of model in Isabelle/HOL. We define two semantics of runs of algorithms with different unit of atomicity and relate these through a reduction theorem that allows us to verify algorithms in the coarse-grained semantics (where proofs are easier) and infer their correctness for the fine-grained one (which corresponds to actual executions). We instantiate the framework by verifying six Consensus algorithms that differ in the underlying algorithmic mechanisms and the kinds of faults they tolerate.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Model Checking of Consensus Algorithms

We show for the first time that standard model checking allows one to completely verify asynchronous algorithms for solving consensus, a fundamental problem in fault-tolerant distributed computing. Model checking is a powerful verification methodology based on state exploration. However it has rarely been applied to consensus algorithms, because these algorithms induce huge, often infinite stat...

متن کامل

Fault Tolerant Reversible QCA Design using TMR and Fault Detecting by a Comparator Circuit

Quantum-dot Cellular Automata (QCA) is an emerging and promising technology that provides significant improvements over CMOS. Recently QCA has been advocated as an applicant for implementing reversible circuits. However QCA, like other Nanotechnologies, suffers from a high fault rate. The main purpose of this paper is to develop a fault tolerant model of QCA circuits by redundancy in hardware a...

متن کامل

Fault Tolerant Reversible QCA Design using TMR and Fault Detecting by a Comparator Circuit

متن کامل

What You Always Wanted to Know About Model Checking of Fault-Tolerant Distributed Algorithms

Distributed algorithms have numerous mission-critical applications in embedded avionic and automotive systems, cloud computing, computer networks, hardware design, and the internet of things. Although distributed algorithms exhibit complex interactions with their computing environment and are difficult to understand for human engineers, computer science has developed only very limited tool supp...

متن کامل

Synthesis of Distributed Algorithms with Parameterized Threshold Guards

Fault-tolerant distributed algorithms are notoriously hard to get right. In this paper we introduce an automated method that helps in that process: the designer provides specifications (the problem to be solved) and a sketch of a distributed algorithm that keeps arithmetic details unspecified. Our tool then automatically fills the missing parts. Fault-tolerant distributed algorithms are typical...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

Archive of Formal Proofs

دوره 2012 شماره

صفحات -

تاریخ انتشار 2012

Verifying Fault-Tolerant Distributed Algorithms in the Heard-Of Model

نویسندگان

چکیده

منابع مشابه

Model Checking of Consensus Algorithms

Fault Tolerant Reversible QCA Design using TMR and Fault Detecting by a Comparator Circuit

Fault Tolerant Reversible QCA Design using TMR and Fault Detecting by a Comparator Circuit

What You Always Wanted to Know About Model Checking of Fault-Tolerant Distributed Algorithms

Synthesis of Distributed Algorithms with Parameterized Threshold Guards

عنوان ژورنال:

اشتراک گذاری